Using Fast Subgraph Isomorphism Checking for Protein Functional Annotation Using SCOP and Gene Ontology
Abstract
We describe a method for protein family identification using a graph representation of proteins. The method incorporates a novel fast subgraph isomorphism method based on a graph index to query a new structure for occurrences of family fingerprints and to assign it to a protein family with a confidence value. This method can provide an independent assignment of the protein family for a new structure in silico, in cases where sequence alignments and structural matches fail to provide proper annotation. Using Gene Ontology and cross validation, we further validate the annotation power of the mined fingerprints.
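The query step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps the index-based fast subgraph isomorphism method for plain backtracking, and the names `LabeledGraph`, `contains_pattern`, and `assign_family`, the residue labels, and the confidence definition (the fraction of a family's fingerprints found in the structure) are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Optional, Tuple


class LabeledGraph:
    """Undirected graph with one label per node (here, a residue type)."""

    def __init__(self, labels: Dict[Hashable, str],
                 edges: Iterable[Tuple[Hashable, Hashable]]):
        self.labels = dict(labels)
        self.adj = {node: set() for node in self.labels}
        for u, v in edges:
            self.adj[u].add(v)
            self.adj[v].add(u)


def contains_pattern(target: LabeledGraph, pattern: LabeledGraph) -> bool:
    """Return True if pattern maps injectively into target,
    preserving node labels and every pattern edge (brute-force backtracking)."""
    order = list(pattern.labels)

    def extend(mapping: Dict[Hashable, Hashable]) -> bool:
        if len(mapping) == len(order):
            return True
        p = order[len(mapping)]
        used = set(mapping.values())
        for t in target.labels:
            if t in used or target.labels[t] != pattern.labels[p]:
                continue
            # Each already-mapped neighbour of p must map to a neighbour of t.
            if all(mapping[q] in target.adj[t]
                   for q in pattern.adj[p] if q in mapping):
                mapping[p] = t
                if extend(mapping):
                    return True
                del mapping[p]
        return False

    return extend({})


def assign_family(structure: LabeledGraph,
                  families: Dict[str, List[LabeledGraph]]
                  ) -> Tuple[Optional[str], float]:
    """Pick the family whose fingerprints occur most often in the structure.
    Confidence here is an assumed stand-in: hits / number of fingerprints."""
    best, best_score = None, 0.0
    for name, fingerprints in families.items():
        if not fingerprints:
            continue
        hits = sum(contains_pattern(structure, fp) for fp in fingerprints)
        score = hits / len(fingerprints)
        if score > best_score:
            best, best_score = name, score
    return best, best_score


# Toy data (hypothetical): a four-residue contact graph and two families.
structure = LabeledGraph(
    {1: "H", 2: "D", 3: "S", 4: "G"},
    [(1, 2), (2, 3), (1, 3), (3, 4)],
)
triad = LabeledGraph({"a": "H", "b": "D", "c": "S"},
                     [("a", "b"), ("b", "c"), ("a", "c")])
pair_sg = LabeledGraph({"a": "S", "b": "G"}, [("a", "b")])
pair_ke = LabeledGraph({"a": "K", "b": "E"}, [("a", "b")])
families = {"family_A": [triad, pair_sg], "family_B": [triad, pair_ke]}

print(assign_family(structure, families))
```

In this toy case both fingerprints of `family_A` occur in the structure, while only one of the two `family_B` fingerprints does, so the structure is assigned to `family_A`. Backtracking is exponential in the worst case, which is the cost a graph index is meant to reduce by filtering candidates before any full matching is attempted.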
Similar resources
Structure-based function inference using protein family-specific fingerprints.
We describe a method to assign a protein structure to a functional family using family-specific fingerprints. Fingerprints represent amino acid packing patterns that occur in most members of a family but are rare in the background, a nonredundant subset of PDB; their information is additional to sequence alignments, sequence patterns, structural superposition, and active-site templates. Fingerp...
A structural alignment kernel for protein structures
Motivation: This work aims to develop computational methods to annotate protein structures in an automated fashion. We employ a support vector machine (SVM) classifier to map from a given class of structures to their corresponding structural (SCOP) or functional (Gene Ontology) annotation. In particular, we build upon recent work describing various kernels for protein structures, where a kernel ...
Assigning new GO annotations to protein data bank sequences by combining structure and sequence homology.
Accompanying the discovery of an increasing number of proteins, there is the need to provide functional annotation that is both highly accurate and consistent. The Gene Ontology (GO) provides consistent annotation in a computer readable and usable form; hence, GO annotation (GOA) has been assigned to a large number of protein sequences based on direct experimental evidence and through inference...
Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: I. Method development
Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph where residue vertices (residue name ...
Identification and prioritization genes related to Hypercholesterolemia QTLs using gene ontology and protein interaction networks
Gene identification represents the first step to a better understanding of the physiological role of the underlying protein and disease pathways, which in turn serves as a starting point for developing therapeutic interventions. Familial hypercholesterolemia is a hereditary metabolic disorder characterized by high low-density lipoprotein cholesterol levels. Hypercholesterolemia is a quantitativ...